A Communication Efficient ADMM-based Distributed Algorithm Using Two-Dimensional Torus Grouping AllReduce
Abstract
Large-scale distributed training mainly consists of sub-model parallel training and parameter synchronization. As the number of workers grows, the efficiency of synchronization suffers. To tackle this problem, we first propose 2D-TGA, a grouping AllReduce method based on a two-dimensional torus topology. This method synchronizes model parameters by grouping and makes full use of bandwidth. Secondly, we propose an algorithm, 2D-TGA-ADMM, which combines 2D-TGA with the alternating direction method of multipliers (ADMM). It reduces the wait time among workers in the synchronization process. Finally, experimental results on the Tianhe-2 supercomputing platform show that, compared with MPI_Allreduce, 2D-TGA could shorten the synchronization wait time by 33%.
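The grouping idea in the abstract can be illustrated with a small single-process simulation. This is only a sketch of a generic two-stage grouped AllReduce on a rows × cols worker grid (reduce within each row group, then within each column group); it is not the authors' 2D-TGA implementation, and the names `group_allreduce` and `grid_allreduce` are hypothetical. It models neither real network transfers nor the wraparound links of a torus.

```python
from typing import List

Vector = List[float]


def group_allreduce(vectors: List[Vector]) -> List[Vector]:
    """Sum the vectors elementwise and hand every group member a copy of the sum."""
    total = [sum(values) for values in zip(*vectors)]
    return [list(total) for _ in vectors]


def grid_allreduce(grid: List[List[Vector]]) -> List[List[Vector]]:
    """Two-stage AllReduce on a worker grid (assumed scheme, for illustration only).

    grid[r][c] is the parameter vector held by the worker at row r, column c.
    """
    rows, cols = len(grid), len(grid[0])

    # Stage 1: each row forms a group; after this, every worker in a row
    # holds that row's partial sum.
    after_rows = [group_allreduce(row) for row in grid]

    # Stage 2: each column forms a group; summing the row partial sums
    # down a column yields the global sum on every worker.
    result: List[List[Vector]] = [[[] for _ in range(cols)] for _ in range(rows)]
    for c in range(cols):
        column = [after_rows[r][c] for r in range(rows)]
        reduced = group_allreduce(column)
        for r in range(rows):
            result[r][c] = reduced[r]
    return result
```

In this sketch each worker only communicates within a group of size cols and then a group of size rows, rather than with all rows × cols workers at once. Dividing the final vectors by the number of workers would give the averaged parameters commonly used for synchronization.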
Similar resources
Sparse Allreduce: Efficient Scalable Communication for Power-Law Data
Many large datasets exhibit power-law statistics: the web graph, social networks, text data, clickthrough data, etc. Their adjacency graphs are termed natural graphs, and are known to be difficult to partition. As a consequence, most distributed algorithms on these graphs are communication-intensive. Many algorithms on natural graphs involve an Allreduce: a sum or average of partitioned data which...
Effective Design of a 3×4 Two Dimensional Distributed Amplifier Based on Gate Line Considerations
In this paper two dimensional wave propagation is used for power combining in drain nodes of a distributed amplifier (DA). The proposed two dimensional DA uses an electrical funnel to add the currents of drain nodes. The proposed structure is modified due to gate lines considerations. Total gain improvement is achieved by engineering the characteristic impedance of gate lines and also make appr...
A hybrid token-based distributed mutual exclusion algorithm using wraparound two-dimensional array logical topology
In token-based distributed mutual exclusion algorithms a unique object (token) is used to grant the right to enter the critical section. For the movement of the token within the computer network, two possible methods can be considered: perpetual mobility of the token and token-asking method. This paper presents a distributed token-based algorithm scheduling mutually exclusive access to a critic...
An Efficient Permutation-Based Parallel Range-Join Algorithm on N-Dimensional Torus Computers
This paper proposes a parallel algorithm to compute the range-join of two relations on N-dimensional torus computers. The algorithm efficiently permutes all subsets of one relation to each processor in turn, where they are joined with the subset of the other relation at that processor using a local range-join algorithm. The analysis shows that the torus algorithm is more efficient than a previous a...
An Efficient Grouping Genetic Algorithm
Genetic algorithm is an intelligent way for solving combinatorial, NP hard problems and many other problems which cannot be easily solved by applying traditional mathematical formula. The proposed method gives a new variant of the Standard Genetic algorithm which is very simple and will easily find the solution even for complex problems. It implements the concept of grouping to reach the optima...
Journal
Journal title: Data Science and Engineering
Year: 2023
ISSN: 2364-1541, 2364-1185
DOI: https://doi.org/10.1007/s41019-022-00202-7